AITopics | transformer-based framework

POIFormer: A Transformer-Based Framework for Accurate and Scalable Point-of-Interest Attribution

Saxena, Nripsuta Ani, Hsu, Shang-Ling, Shetty, Mehul, Alkhadra, Omar, Shahabi, Cyrus, Horn, Abigail L.

arXiv.org Artificial Intelligence, Jul-15-2025

Accurately attributing user visits to specific Points of Interest (POIs) is a foundational task for mobility analytics, personalized services, marketing, and urban planning. However, POI attribution remains challenging due to GPS inaccuracies, typically ranging from 2 to 20 meters in real-world settings, and the high spatial density of POIs in urban environments, where multiple venues can coexist within a small radius (e.g., over 50 POIs within a 100-meter radius in dense city centers). Relying on proximity is therefore often insufficient for determining which POI was actually visited. We introduce POIFormer, a novel Transformer-based framework for accurate and efficient POI attribution. Unlike prior approaches that rely on limited spatiotemporal, contextual, or behavioral features, POIFormer jointly models a rich set of signals, including spatial proximity, visit timing and duration, contextual features from POI semantics, and behavioral features from user mobility and aggregated crowd behavior patterns, using the Transformer's self-attention mechanism to capture complex interactions across these dimensions. By leveraging the Transformer to model a user's past and future visits (with the current visit masked) and incorporating crowd-level behavioral patterns through pre-computed kernel density estimates (KDEs), POIFormer enables accurate, efficient attribution in large, noisy mobility datasets. Its architecture supports generalization across diverse data sources and geographic contexts while avoiding reliance on hard-to-access or unavailable data layers, making it practical for real-world deployment. Extensive experiments on real-world mobility datasets demonstrate significant improvements over existing baselines, particularly in challenging real-world settings characterized by spatial noise and dense POI clustering.
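To make concrete why proximity alone can mislead, here is a minimal sketch that combines a spatial likelihood with a crowd-level time-of-day KDE. It is not POIFormer's Transformer model: the product scoring rule, the Gaussian GPS error model, the `gps_sigma_m` value, and the POI names and arrival hours are all assumptions made up for illustration.

```python
import math


def gaussian_kde(samples, bandwidth):
    """Return a 1-D Gaussian kernel density estimate as a callable."""
    norm = 1.0 / (len(samples) * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples
        )

    return density


def attribute_visit(distances_m, arrival_hour, hour_kdes, gps_sigma_m=10.0):
    """Score each candidate POI as spatial likelihood x crowd arrival density.

    Assumed scoring rule for illustration only; scores are normalized to sum to 1.
    """
    scores = {}
    for poi, dist in distances_m.items():
        spatial = math.exp(-0.5 * (dist / gps_sigma_m) ** 2)  # assumed GPS noise model
        temporal = hour_kdes[poi](arrival_hour)               # pre-computed crowd KDE
        scores[poi] = spatial * temporal
    total = sum(scores.values())
    return {poi: s / total for poi, s in scores.items()}


# Hypothetical crowd arrival hours per POI (hour of day, wrap-around ignored).
hour_kdes = {
    "cafe": gaussian_kde([7.5, 8.0, 8.5, 9.0, 9.5], bandwidth=1.0),
    "bar": gaussian_kde([20.0, 21.0, 22.0, 23.0], bandwidth=1.0),
}

# The bar is nearer to the GPS fix, but the visit starts at 08:30.
distances_m = {"cafe": 12.0, "bar": 8.0}
nearest = min(distances_m, key=distances_m.get)
probs = attribute_visit(distances_m, arrival_hour=8.5, hour_kdes=hour_kdes)
best = max(probs, key=probs.get)
```

In this toy case the nearest POI and the highest-scoring POI differ, which is the failure mode of proximity-only attribution that the abstract describes. Pre-computing the KDEs once per POI keeps the per-visit cost to a lookup and a multiplication.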

data mining, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2507.09137

Country:

Europe (0.68)
North America > United States > California > Los Angeles County > Los Angeles (0.48)

Genre: Research Report > New Finding (1.00)

Industry:

Consumer Products & Services (0.93)
Retail (0.93)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
(2 more...)

A Transformer-Based Framework for Greek Sign Language Production using Extended Skeletal Motion Representations

Pratikaki, Chrysa, Filntisis, Panagiotis, Katsamanis, Athanasios, Roussos, Anastasios, Maragos, Petros

arXiv.org Artificial Intelligence, Mar-4-2025

To address communication barriers between the DHH (Deaf and Hard-of-Hearing) and the hearing communities, the field of Sign Language Processing has emerged at the intersection of linguistics, computer vision, and machine learning. Sign Language Processing encompasses a variety of tasks aimed at bridging the gap between DHH and hearing communities by enabling the automatic translation and generation of sign language. The most critical components of an effective sign language system are Sign Language Translation (SLT) and Sign Language Production (SLP). In this paper, we primarily focus on Sign Language Production (SLP).

Building on insights from previous research, we propose a deep learning model for Sign Language Production (SLP), which to our knowledge is the first attempt on Greek SLP. We tackle this task by utilizing a transformer-based architecture that enables the translation from text input to human pose keypoints, and the opposite. We evaluate the effectiveness of the proposed pipeline on the Greek SL dataset Elementary23, through a series of comparative analyses and ablation studies. Our pipeline's components, which include data-driven gloss generation, training through video to text translation and a ...

landmark, language production, sign language production, (14 more...)

arXiv.org Artificial Intelligence

2503.02421

Country:

Europe > Greece > Ionian Islands > Corfu (0.06)
Europe > Greece > Attica > Athens (0.05)
North America > United States > New York (0.04)

Genre: Research Report (0.50)

Industry: Education > Curriculum > Subject-Specific Education (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

A Full Transformer-based Framework for Automatic Pain Estimation using Videos

Gkikas, Stefanos, Tsiknakis, Manolis

arXiv.org Artificial Intelligence, Dec-19-2024

The automatic estimation of pain is essential in designing an optimal pain management system offering reliable assessment and reducing the suffering of patients. In this study, we present a novel full transformer-based framework consisting of a Transformer in Transformer (TNT) model and a Transformer leveraging cross-attention and self-attention blocks. Elaborating on videos from the BioVid database, we demonstrate state-of-the-art performances, showing the efficacy, efficiency, and generalization capability across all the primary pain estimation tasks.

feature extraction module, machine learning, natural language, (16 more...)

arXiv.org Artificial Intelligence

doi: 10.1109/EMBC40787.2023.10340872

2412.15095

Country:

Europe > Greece (0.04)
North America > United States (0.04)
Europe > Switzerland (0.04)

Genre: Research Report > New Finding (0.35)

Industry:

Health & Medicine > Consumer Health (0.87)
Health & Medicine > Therapeutic Area > Neurology (0.66)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

PINNsFormer: A Transformer-Based Framework For Physics-Informed Neural Networks

Zhao, Zhiyuan, Ding, Xueying, Prakash, B. Aditya

arXiv.org Artificial Intelligence, Oct-3-2023

Physics-Informed Neural Networks (PINNs) have emerged as a promising deep learning framework for approximating numerical solutions to partial differential equations (PDEs). However, conventional PINNs, relying on multilayer perceptrons (MLP), neglect the crucial temporal dependencies inherent in practical physics systems and thus fail to propagate the initial condition constraints globally and accurately capture the true solutions under various scenarios. In this paper, we introduce a novel Transformer-based framework, termed PINNsFormer, designed to address this limitation. PINNsFormer can accurately approximate PDE solutions by utilizing multi-head attention mechanisms to capture temporal dependencies. PINNsFormer transforms point-wise inputs into pseudo sequences and replaces point-wise PINNs loss with a sequential loss. Additionally, it incorporates a novel activation function, Wavelet, which anticipates Fourier decomposition through deep neural networks. Empirical results demonstrate that PINNsFormer achieves superior generalization ability and accuracy across various scenarios, including PINNs failure modes and high-dimensional PDEs. Moreover, PINNsFormer offers flexibility in integrating existing learning schemes for PINNs, further enhancing its performance.
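The three ingredients named in the abstract (pseudo sequences, a sequential loss, and the Wavelet activation) can be sketched in a few lines. The abstract does not give their formulas, so everything here is an assumption: the time-shifted construction of the pseudo sequence, the form w1*sin(z) + w2*cos(z) for Wavelet, and the parameters `k`, `dt`, `w1`, and `w2`. The residual function is a made-up stand-in for a real PDE residual.

```python
import math


def pseudo_sequence(x, t, k=5, dt=1e-3):
    """Expand one point (x, t) into k points shifted forward in time.

    Assumed construction: [(x, t), (x, t + dt), ..., (x, t + (k - 1) * dt)].
    """
    return [(x, t + i * dt) for i in range(k)]


def wavelet(z, w1=1.0, w2=1.0):
    """Assumed Wavelet activation: a weighted sum of sine and cosine.

    In a trained network w1 and w2 would be learnable parameters.
    """
    return w1 * math.sin(z) + w2 * math.cos(z)


def sequential_loss(residual_fn, sequence):
    """Mean squared residual over every step of the pseudo sequence."""
    return sum(residual_fn(x, t) ** 2 for x, t in sequence) / len(sequence)


# Hypothetical residual standing in for a PDE residual evaluated at (x, t).
def toy_residual(x, t):
    return x - t


seq = pseudo_sequence(x=0.5, t=0.0, k=3, dt=0.1)
loss = sequential_loss(toy_residual, seq)
```

Averaging the residual over the whole sequence, rather than at a single point, is what lets an attention-based model see how neighboring time steps relate to one another.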

physics-informed neural network, pinnsformer, transformer-based framework

arXiv.org Artificial Intelligence

2307.11833

Genre: Research Report (0.69)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

A Transformer-based Framework For Multi-variate Time Series: A Remaining Useful Life Prediction Use Case

Ogunfowora, Oluwaseyi, Najjaran, Homayoun

arXiv.org Artificial Intelligence, Aug-29-2023

In recent times, Large Language Models (LLMs) have captured a global spotlight and revolutionized the field of Natural Language Processing. One of the factors attributed to the effectiveness of LLMs is the model architecture used for training: the transformer. Transformer models excel at capturing contextual features in sequential data; since time series data are sequential, transformer models can be leveraged for more efficient time series prediction. The field of prognostics is vital to system health management and proper maintenance planning. A reliable estimation of the remaining useful life (RUL) of machines holds the potential for substantial cost savings. This includes avoiding abrupt machine failures, maximizing equipment usage, and serving as a decision support system (DSS). This work proposes an encoder-transformer architecture-based framework for multivariate time series prediction for a prognostics use case. We validated the effectiveness of the proposed framework on all four sets of the C-MAPSS benchmark dataset for the remaining useful life prediction task. To effectively transfer the knowledge and application of transformers from the natural language domain to time series, three model-specific experiments were conducted. Also, to make the model aware of the initial stages of the machine's life and its degradation path, a novel expanding window method is proposed for the first time in this work; compared with the sliding window method, it led to a large improvement in the performance of the encoder-transformer model. Finally, the performance of the proposed encoder-transformer model was evaluated on the test dataset and compared with the results from 13 other state-of-the-art (SOTA) models in the literature; it outperformed them all, with an average performance increase of 137.65% over the next best model across all the datasets.
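The difference between the sliding and expanding window methods is easiest to see on a short series. The abstract does not specify how the expanding windows are built or batched, so this is a minimal sketch under assumptions: the minimum window size, the left-padding scheme, and the sensor readings are all hypothetical.

```python
def sliding_windows(series, size):
    """Fixed-length windows: each one sees only the most recent `size` steps."""
    return [series[i:i + size] for i in range(len(series) - size + 1)]


def expanding_windows(series, min_size):
    """Growing windows: each one starts at the first step of the series."""
    return [series[:end] for end in range(min_size, len(series) + 1)]


def left_pad(window, length, pad_value=0.0):
    """Pad on the left so all windows share one length (assumed batching scheme)."""
    return [pad_value] * (length - len(window)) + list(window)


# Hypothetical sensor readings for one machine, ordered in time.
readings = [0.1, 0.2, 0.4, 0.7, 1.1, 1.6]

sliding = sliding_windows(readings, size=3)
expanding = expanding_windows(readings, min_size=3)
padded = [left_pad(w, len(readings)) for w in expanding]
```

Every expanding window keeps the first reading, so the model always sees the early life of the machine; a sliding window discards it once the window moves on. The cost is variable-length input, which is why some padding or masking step is needed before batching.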

multi-variate time series, transformer-based framework, useful life prediction use case

arXiv.org Artificial Intelligence

2308.09884

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)

A Transformer-based Framework for POI-level Social Post Geolocation

Li, Menglin, Lim, Kwan Hui, Guo, Teng, Liu, Junhua

arXiv.org Artificial Intelligence, Oct-26-2022

POI-level geo-information of social posts is critical to many location-based applications and services. However, the multi-modality, complexity, and diverse nature of social media data and their platforms limit the performance of inferring such fine-grained locations and their subsequent applications. To address this issue, we present a transformer-based general framework, which builds upon pre-trained language models and considers non-textual data, for social post geolocation at the POI level. To this end, inputs are categorized to handle different social data, and an optimal combination strategy is provided for feature representations. Moreover, a uniform representation of hierarchy is proposed to learn temporal information, and a concatenated version of encodings is employed to better capture feature-wise positions. Experimental results on various social datasets demonstrate that three variants of our proposed framework outperform multiple state-of-the-art baselines by a large margin in terms of accuracy and distance error metrics.

artificial intelligence, machine learning, natural language, (21 more...)

arXiv.org Artificial Intelligence

2211.01336

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Asia > Singapore (0.05)
Asia > Middle East > Qatar > Ad-Dawhah > Doha (0.04)
(3 more...)

Genre: Research Report (0.82)

Industry: Information Technology > Services (0.49)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
